---
title: "Introduction to renv"
author: "Kevin Ushey"
date: "`r Sys.Date()`"
output:
   rmarkdown::html_vignette:
      keep_md: true
      pandoc_args:
         - --columns=1000
vignette: >
  %\VignetteIndexEntry{Introduction to renv}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  eval     = FALSE
)
```

The `renv` package is a new effort to bring project-local R dependency
management to your projects. The goal is for `renv` to be a robust, stable
replacement for the [Packrat](https://rstudio.github.io/packrat/) package, with
fewer surprises and better default behaviors.

Underlying the philosophy of `renv` is that your existing workflows should
continue to work as they did before: `renv` manages library paths (and other
project-specific state) to help isolate your project's R dependencies, while
the existing tools you've used for managing R packages (e.g.
`install.packages()`, `remove.packages()`) keep working unchanged.


## Workflow

The general workflow when working with `renv` is:

1. Call `renv::init()` to initialize a new project-local environment with a
   private R library,

2. Work in the project as normal, installing and removing new R packages as
   they are needed in the project,

3. Call `renv::snapshot()` to save the state of the project library to the
   lockfile (called `renv.lock`),

4. Continue working on your project, installing and updating R packages as
   needed,

5. Call `renv::snapshot()` again to save the state of your project library if
   your attempts to update R packages were successful, or call `renv::restore()`
   to revert to the previous state as encoded in the lockfile if your attempts
   to update packages introduced some new problems.
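
The steps above might look as follows in an R session. This is only a sketch:
`jsonlite` is an arbitrary example package, and each call assumes the R session
was started in the project's root directory.

```r
# 1. Initialize a project-local environment with a private library.
renv::init()

# 2. Work as usual; packages are installed into the project library.
install.packages("jsonlite")

# 3. Record the state of the project library in renv.lock.
renv::snapshot()

# 4. Later on, try updating a package.
install.packages("jsonlite")

# 5a. If the update works for you, record the new state ...
renv::snapshot()

# 5b. ... otherwise, roll the library back to the state in renv.lock.
renv::restore()
```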

The `renv::init()` function attempts to ensure the newly-created project
library includes all R packages currently used by the project. It does this
by crawling R files within the project for dependencies with the
`renv::dependencies()` function. The discovered packages are then installed
into the project library with the `renv::hydrate()` function, which will also
attempt to save time by copying packages from your user library (rather than
reinstalling from CRAN) as appropriate.
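
To preview what `renv::init()` would discover before running it, you can call
`renv::dependencies()` yourself. A minimal sketch, assuming `renv` is installed
and the working directory is the project root:

```r
# Scan the project's files for package usages; the result is a data frame
# with one row per discovered usage.
deps <- renv::dependencies()

# The unique set of packages the project appears to use.
sort(unique(deps$Package))
```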

Calling `renv::init()` will also write out the infrastructure necessary to
automatically load and use the private library for new R sessions launched
from the project root directory. This is accomplished by creating (or amending)
a project-local `.Rprofile` with the necessary code to load the project when
the R session is started.

If you'd like to initialize a project without attempting dependency discovery
and installation -- that is, you'd prefer to manually install the packages
your project requires on your own -- you can use `renv::init(bare = TRUE)`
to initialize a project with an empty project library.


## Reproducibility

Using `renv`, it's possible to "save" and "load" the state of your project
library. More specifically, you can use:

- `renv::snapshot()` to save the state of your project to `renv.lock`; and
  
- `renv::restore()` to restore the state of your project from `renv.lock`.

For each package used in your project, `renv` will record the package version,
and (if known) the external source from which that package can be retrieved.
`renv::restore()` uses that information to retrieve and reinstall those
packages in your project.


### Caveats

It is important to emphasize that **renv is not a panacea for reproducibility**.
Rather, it is a tool that can help make projects reproducible by solving one
small part of the problem: it records the version of R + R packages being used
in a project, and provides tools for reinstalling the declared versions of
those packages in a project. Ultimately, making a project reproducible requires
some thoughtfulness from the user: what does it mean for a particular project to
be reproducible, and how can `renv` (and other tools) be used to accomplish
that particular goal of reproducibility?

There are still a number of factors that can affect whether a project
could truly be reproducible in the future -- for example,

1. The results produced by a particular project might depend on other 
   components of the system it's being run on -- for example, the operating
   system itself, the versions of system libraries in use, the compiler(s)
   used to compile R and the R packages used, and so on. Keeping a 'stable'
   machine image is a separate challenge, but [Docker](https://www.docker.com/)
   is one popular solution. See also `vignette("docker", package = "renv")` for
   recommendations on how Docker can be used together with `renv`.

2. The R packages that the project depends on may no longer be available.
   If your project depends on R packages available on CRAN, it's possible those
   packages may be removed in the future -- either by request of the package
   maintainer, or by the maintainers of CRAN itself. This is quite rare, but
   needs consideration if reproducibility of a project is paramount.

In addition, be aware that package installation may fail if a package was
originally installed through a CRAN-available binary, but that binary is no
longer available. `renv` will attempt to install the package from sources in
this situation, but attempts to install from source can (and often do) fail due
to missing system prerequisites for compilation of a package. The
`renv::equip()` function may be useful in these scenarios, especially on Windows:
it will download external software commonly used when compiling R packages from
sources, and instruct R to use that software during compilation.

A salient example of such an external dependency is the `rmarkdown` package, as
it relies heavily on the [`pandoc`](https://pandoc.org/) command line utility. However, because
pandoc is not bundled with the `rmarkdown` package (it is normally provided by
RStudio, or installed separately by the user), simply restoring an `renv`
project using `rmarkdown` may not be sufficient -- one also needs to ensure the
project is run in an environment with the correct version of `pandoc` available.


## Infrastructure

The following files are written to and used by projects using `renv`:

| **File**            | **Usage**                                                                           |
| -----------------   | ----------------------------------------------------------------------------------- |
| `.Rprofile`         | Used to activate `renv` for new R sessions launched in the project.                 |
| `renv.lock`         | The lockfile, describing the state of your project's library at some point in time. |
| `renv/activate.R`   | The activation script run by the project `.Rprofile`.                               |
| `renv/library`      | The private project library.                                                        |
| `renv/settings.dcf` | Project settings -- see `?settings` for more details.                               |

In particular, `renv/activate.R` ensures that the project library is made active
for newly launched R sessions. This ensures that any new R processes launched
within the project directory will use the project library, and hence are
isolated from the regular user library.
 
For development and collaboration, the `.Rprofile`, `renv.lock` and
`renv/activate.R` files should be committed to your version control system; the
`renv/library` directory should normally be ignored. Note that `renv::init()`
will attempt to write the requisite ignore statements to the project
`.gitignore`.


## Dependency Discovery

By default, `renv::snapshot()` will examine your project's R files to determine
which packages are used in your project, and will include only those packages
(alongside their recursive dependencies) in the lockfile. This is done via
a call to the `renv::dependencies()` function. We call this an "implicit"
snapshot, since the packages your project depends on are implicit based on how
packages appear to be used in your project. `renv` uses static analysis to
determine which packages appear to be used; e.g. by scanning your code for calls
to `library()` or `require()`.

While useful, this approach is not 100% reliable in detecting the packages
required by your project. If you find that `renv`'s dependency discovery is
failing to discover one or more packages used in your project, one escape hatch
is to include a file called `_dependencies.R` with code of the form:

```
library(<pkg>)
```
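
To see why static analysis can only ever be approximate, consider the following
toy scanner. It is not `renv`'s implementation: `find_library_calls()` is a
hypothetical helper written for this illustration, and it only recognizes the
simplest forms of `library()` and `require()` calls.

```r
find_library_calls <- function(code) {
  # Tokenize the code without evaluating it.
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)
  tokens <- tokens[tokens$terminal, ]

  # Locate calls of the form library(...) or require(...).
  calls <- which(
    tokens$token == "SYMBOL_FUNCTION_CALL" &
      tokens$text %in% c("library", "require")
  )

  # The first argument sits two tokens after the function name,
  # just past the opening parenthesis.
  args <- tokens[calls + 2L, ]
  args <- args[args$token %in% c("SYMBOL", "STR_CONST"), ]

  # Drop surrounding quotes from string arguments.
  unique(gsub("^[\"']|[\"']$", "", args$text))
}

find_library_calls(c(
  "library(dplyr)",
  "if (require('shiny')) message('shiny is available')"
))
#> [1] "dplyr" "shiny"
```

A scanner of this kind cannot see packages whose names are only computed at
runtime, which is the sort of usage a `_dependencies.R` file can declare.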


### Ignore Files

By default, `renv` reads the `.gitignore` files in your project (if any)
to infer which files should be ignored when scanning for dependencies.
If you find that `renv`'s dependency discovery is scanning files you don't
want to be scanned, you can use an `.renvignore` file to instruct `renv`
to ignore certain patterns of files in the project. For example, you might
use:

```
/data
```

to tell `renv` not to scan files within the `data` folder.

If you'd prefer that `renv` ignored all folders by default, except for some
subset of folders where you place your code files, you could use something
like:

```
*
!/code
```

In this case, `renv` will only scan your `code` folder at the root of the
project directory for dependencies.


### Explicit Snapshots

If you'd instead prefer to explicitly declare which packages are used in your
project, you can do so by creating a `DESCRIPTION` file at your project root.
These `DESCRIPTION` files should be formatted similarly to those used by
default in R package development -- see https://r-pkgs.org/description.html
for more details.

In this case, your `DESCRIPTION` file might look like:

```
Type: project
Description: My project.
Depends:
    tidyverse,
    devtools,
    shiny,
    data.table
```

The packages used in your project can be part of either the `Depends` or
`Imports` fields.
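
For the `DESCRIPTION` file to be the only source of dependency information,
`renv` has to be switched from implicit to explicit snapshots. A sketch of how
that might be done, assuming a version of `renv` that provides the
`snapshot.type` project setting:

```r
# Make explicit snapshots the default for this project ...
renv::settings$snapshot.type("explicit")
renv::snapshot()

# ... or request an explicit snapshot for a single call only.
renv::snapshot(type = "explicit")
```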


## Collaborating

When sharing a project with other collaborators, you may want to ensure everyone
is working with the same environment -- otherwise, code in the project may
unexpectedly fail to run because of changes in behavior between different
versions of the packages in use. `renv` can help to make such collaboration
easier -- see `vignette("collaborating", package = "renv")` for more details.


## Package Sources

`renv` is able to install and restore packages from a variety of sources,
including:

- [CRAN](https://cran.r-project.org/),
- [Bioconductor](https://www.bioconductor.org/),
- [GitHub](https://github.com/),
- [GitLab](https://about.gitlab.com/),
- [Bitbucket](https://bitbucket.org/).

`renv` uses an installed package's `DESCRIPTION` file to infer its source. For
example, packages installed from the CRAN repositories typically have the field:

```
Repository: CRAN
```

set, and `renv` takes this as a signal that the package was retrieved from CRAN.


### Inferring Package Sources

The following fields are checked, in order, when inferring a package's source:

1. The `RemoteType` field; typically written for packages installed by the
   `devtools`, `remotes` and `pak` packages,

1. The `Repository` field; for example, packages retrieved from CRAN will
   typically have the `Repository: CRAN` field,

1. The `biocViews` field; typically present for packages installed from the
   Bioconductor repositories.

As a fallback, if `renv` is unable to determine a package's source from the
`DESCRIPTION` file directly, but a package of the same name is available in the
active R repositories (as specified in `getOption("repos")`), then the package
will be treated as though it was installed from an R package repository.
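
The order in which the `DESCRIPTION` fields are consulted can be sketched as
below. This is an illustration rather than `renv`'s own code: `infer_source()`
and the labels it returns are hypothetical, the input is a plain list standing
in for a parsed `DESCRIPTION` file, and the repository fallback is left out.

```r
infer_source <- function(desc) {
  # 1. RemoteType wins if present (e.g. "github").
  if (!is.null(desc[["RemoteType"]]))
    return(desc[["RemoteType"]])

  # 2. Otherwise, a Repository field marks a package repository.
  if (!is.null(desc[["Repository"]]))
    return("repository")

  # 3. Otherwise, biocViews suggests Bioconductor.
  if (!is.null(desc[["biocViews"]]))
    return("bioconductor")

  "unknown"
}

infer_source(list(Package = "a", Repository = "CRAN"))
#> [1] "repository"

# RemoteType takes precedence over Repository.
infer_source(list(Package = "b", RemoteType = "github", Repository = "CRAN"))
#> [1] "github"
```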

If all of the above methods fail, `renv` will finally check for a package
available from the _cellar_. See [here](cellar.html) for more details.
The package cellar is typically used as an escape hatch, for packages which do
not have a well-defined remote source, or for packages which might not be
remotely accessible from your machine.


### Unknown Sources

If `renv` is unable to infer a package's source, it will inform you during
`renv::snapshot()` -- for example, if we attempted to snapshot a package
called `skeleton` with no known source:

```
> renv::snapshot()
The following package(s) were installed from an unknown source:

        skeleton

renv may be unable to restore these packages in the future.
Consider reinstalling these packages from a known source (e.g. CRAN).

Do you want to proceed? [y/N]:
```

While you can still create a lockfile with such packages, `restore()` will
likely fail unless you can ensure this package is installed through some
other mechanism.


### Custom R Package Repositories

Custom and local R package repositories are supported as well. The only
requirement is that these repositories are set as part of the `repos` R
option, and that these repositories are named. For example, you might use:

```
repos <- c(CRAN = "https://cloud.r-project.org", WORK = "https://work.example.org")
options(repos = repos)
```

to tell `renv` to work with both the official CRAN package repository, as well
as a package repository you have hosted and set up in your work environment.


## Upgrading renv

After initializing a project with `renv`, that project will then be 'bound'
to the particular version of `renv` that was used to initialize the project.
If you need to upgrade (or otherwise change) the version of `renv` associated
with a project, you can use `renv::upgrade()`. This will install the
latest-available version of `renv` from your declared package repositories.
Alternatively, if you're currently using a development version of `renv` as
installed from GitHub in your project, then `renv::upgrade()` will install the
latest-available version of `renv` from GitHub.

With each commit of `renv`, we bump the package version and also tag the
commit with the associated package version. This implies that you can call,
for example:

```
renv::upgrade(version = "`r renv:::renv_package_version("renv")`")
```

to request the installation of that particular version of `renv` if so required.


## Cache

One of `renv`'s primary features is the use of a global package cache, which is
shared across all projects using `renv`. The `renv` package cache provides
two primary benefits:

1. Future calls to `renv::restore()` and `renv::install()` will become much
   faster, as `renv` will be able to find and re-use packages already installed
   in the cache.

2. Because it is not necessary to have duplicate versions of your packages
   installed in each project, the `renv` cache should also help you save
   disk space relative to an approach with project-specific libraries without
   a global cache.

To understand the `renv` cache, we need to first understand what an R _library_
is. An R library is, normally, a directory of installed R packages which can be
loaded and used within an R session. These are the directories reported by e.g.
`.libPaths()`, and R uses these directories when searching for packages to
load (e.g. in response to a call to `library(dplyr)`).

When using `renv` with the global package cache, the project library is instead
formed as a directory of symlinks (or, on Windows, junction points) into the
`renv` global package cache. Hence, while each `renv` project is isolated from
other projects on your system, they can still re-use the same installed packages
as required.

In some cases, `renv` will be unable to directly link from the global package
cache to your project library -- for example, if the package cache and your
project library live on different disk volumes. In such a case, `renv` will
instead copy the package from the cache into the project library.
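
One way to check whether the packages in a library are links or copies is to
ask base R where each entry points. A small sketch, assuming it is run inside
an activated `renv` project so that `.libPaths()[1]` is the project library
(on Windows, where junction points are used, `Sys.readlink()` may report
nothing):

```r
# The first library path; in an activated project this is the project library.
lib <- .libPaths()[1]

# One entry per installed package.
entries <- list.files(lib, full.names = TRUE)

# For symlinks, the link target; "" for regular directories.
targets <- Sys.readlink(entries)

head(data.frame(package = basename(entries), target = targets))
```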

By default, `renv` generates its cache in the following folders:

| **Platform** | **Location**                         |
| ------------ | ------------------------------------ |
| Linux        | `~/.local/share/renv`                |
| macOS        | `~/Library/Application Support/renv` |
| Windows      | `%LOCALAPPDATA%/renv`                |

If you'd like to share the package cache across multiple users, you can do so by
setting the `RENV_PATHS_CACHE` environment variable to a shared path. This
variable can be set in an R startup file to make it apply to all R sessions.
For example, it could be set within:

- A project-local `.Renviron`;
- The user-level `~/.Renviron`;
- A site-wide file at `$(R RHOME)/etc/Renviron.site`.
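
As a sketch, the variable could also be set from R for the current session.
The path below is a hypothetical placeholder; substitute a directory that all
intended users can read and write.

```r
# Equivalent to a line `RENV_PATHS_CACHE=/opt/local/renv/cache` in .Renviron,
# but only for the current R session. The path is a placeholder.
Sys.setenv(RENV_PATHS_CACHE = "/opt/local/renv/cache")

# Confirm the value that R (and hence renv) will see.
Sys.getenv("RENV_PATHS_CACHE")
#> [1] "/opt/local/renv/cache"
```

Setting the variable in a startup file rather than interactively ensures it is
in place before `renv` is loaded.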

You may also want to set `RENV_PATHS_CACHE` so that the global package cache can
be stored on the same volume as the projects you normally work on. This is
especially important when working with projects stored on a networked filesystem.

While we recommend enabling the cache by default, if you're having trouble with
`renv` when the cache is enabled, it can be disabled by setting the project
setting `renv::settings$use.cache(FALSE)`. Doing this will ensure that packages
are then installed into your project library directly, without attempting to
link and use packages from the `renv` cache.

If you find a problematic package has entered the cache (for example, an
installed package has become corrupted), that package can be removed with the
`renv::purge()` function. See the `?purge` documentation for caveats and things
to be aware of when removing packages from the cache.


## Installation from Source

In the end, `renv` still needs to install R packages -- either from binaries
available from CRAN, or from sources when binaries are not available.
Installation from source can be challenging for a few reasons:

1. Your system will need to have a compatible compiler toolchain available.
   In some cases, R packages may depend on C / C++ features that aren't
   available in an older system toolchain, especially in some older Linux
   enterprise environments.

2. Your system will need requisite system libraries, as many R packages contain
   compiled C / C++ code that depends on and links to these libraries.

<!-- TODO: renv::equip() for Linux + macOS; use sysreqsdb -->


## Downloads

By default, `renv` uses [`curl`](https://curl.se/) for file downloads
when available. This allows `renv` to support a number of download features
across multiple versions of R, including:

- Custom headers (used especially for authentication),
- Connection timeouts,
- Download retries on transient errors.

If `curl` is not available on your machine, it is highly recommended that you
install it. Newer versions of Windows 10 come with a bundled version of
`curl.exe`; other users on Windows can use `renv::equip()` to download and
install a recent copy of `curl`. Newer versions of macOS come with a bundled
version of `curl` that is adequate for usage with `renv`, and most Linux package
managers have a modern version of `curl` available in their package
repositories.
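
To check whether a `curl` executable is visible to R, you can look it up on
the `PATH` with base R. This only tests that the program can be found, not
that its version is recent enough:

```r
# Sys.which() returns the full path to the executable, or "" if not found.
curl_path <- Sys.which("curl")

if (nzchar(curl_path)) {
  message("curl found at: ", curl_path)
} else {
  message("curl was not found on the PATH")
}
```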

`curl` downloads can be configured through `renv`'s configuration settings --
see `?renv::config` for more details.

If you've already configured R's downloader and would like to bypass `renv`'s
attempts to use `curl`, you can use the R option `renv.download.override`. For
example, executing:

```r
options(renv.download.override = utils::download.file)
```

would instruct `renv` to use R's own download machinery when attempting to
download files from the internet (respecting the R options
`download.file.method` and `download.file.extra` as appropriate). Advanced users
can also provide their own download function, provided its signature matches
that of `utils::download.file()`.
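
As a sketch of such a custom function, the wrapper below logs each URL and
then defers to R's downloader. `logged_download()` is a hypothetical name, and
forwarding any remaining arguments through `...` is an assumption about how the
override is called rather than documented behavior.

```r
# A thin wrapper around utils::download.file() that reports each download.
logged_download <- function(url, destfile, ...) {
  message("Downloading: ", url)
  utils::download.file(url, destfile, ...)
}

options(renv.download.override = logged_download)
```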

You can also instruct `renv` to use a different download method by setting
the `RENV_DOWNLOAD_METHOD` environment variable. For example:

```
# use Windows' internal download machinery
Sys.setenv(RENV_DOWNLOAD_METHOD = "wininet")

# use R's bundled libcurl implementation
Sys.setenv(RENV_DOWNLOAD_METHOD = "libcurl")
```

Note that other features (e.g. authentication) may not be supported when
using an alternative download file method -- you will have to configure
the downloader yourself if that is required. See `?download.file` for more
details.


### Proxies

If your downloads need to go through a proxy server, then there are a variety of
approaches you can take to make this work:

1. Set the `http_proxy` and / or `https_proxy` environment variables. These
   environment variables can contain the full URL to your proxy server,
   including a username + password if necessary.

2. You can use a `.curlrc` (`_curlrc` on Windows) to provide information about
   the proxy server to be used. This file should be placed in your home folder
   (see `Sys.getenv("HOME")`, or `Sys.getenv("R_USER")` on Windows);
   alternatively, you can set the `CURL_HOME` environment variable to point
   to a custom 'home' folder to be used by `curl` when resolving the runtime
   configuration file. On Windows, you can also place your `_curlrc` in the
   same directory where the `curl.exe` binary is located.

See the curl documentation on [proxies](https://everything.curl.dev/usingcurl/proxies.html)
and [config files](https://everything.curl.dev/cmdline/configfile.html) for more details.
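
For the first approach, the environment variables can be set from R (or, more
durably, in an `.Renviron` file). The host, port and credentials below are
hypothetical placeholders:

```r
# Placeholder proxy URL; replace with the details of your own proxy server.
proxy <- "http://user:password@proxy.example.org:8080"

Sys.setenv(
  http_proxy  = proxy,
  https_proxy = proxy
)
```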

As an [example](https://github.com/rstudio/renv/issues/146), the following
`_curlrc` works when using authentication with NTLM and SSPI on Windows:

```
--proxy "your.proxy.dns:port"
--proxy-ntlm
--proxy-user ":"
--insecure
```

The [curl](https://cran.r-project.org/package=curl) R package also has a helper:

```
curl::ie_get_proxy_for_url()
```

which may be useful when attempting to discover this proxy address.


## Authentication

Your project may make use of packages which are available from remote sources
requiring some form of authentication to access -- for example, a GitHub
enterprise server. Usually, either a personal access token (PAT) or a username
and password combination is required for authentication. `renv` is able to
authenticate when downloading from such sources, using the same system as the
[remotes](https://cran.r-project.org/package=remotes) package.
In particular, environment variables are used to record and transfer the
required authentication information.

| **Remote Source** | **Authentication**                      |
| ----------------- | --------------------------------------- |
| GitHub            | `GITHUB_PAT`                            |
| GitLab            | `GITLAB_PAT`                            |
| Bitbucket         | `BITBUCKET_USER` + `BITBUCKET_PASSWORD` |
| Git Remotes       | `GIT_PAT` / `GIT_USER` + `GIT_PASSWORD` |

These credentials can be stored in e.g. `.Renviron`, or can be set in your R
session through other means as appropriate.

If you require custom authentication for different packages (for example, your
project makes use of packages available on different GitHub enterprise servers),
you can use the `renv.auth` R option to provide package-specific authentication
settings. `renv.auth` can either be a named list associating package names
with environment variables, or a function accepting a package name + record, and
returning a list of environment variables. For example:

```r
# define a function providing authentication
options(renv.auth = function(package, record) {
  if (package == "MyPackage")
    return(list(GITHUB_PAT = "<pat>"))
})

# use a named list directly
options(renv.auth = list(
  MyPackage = list(GITHUB_PAT = "<pat>")
))

# alternatively, set package-specific option
options(renv.auth.MyPackage = list(GITHUB_PAT = "<pat>"))
```

For packages installed from Git remotes, `renv` will attempt to use `git` from
the command line to download and restore the associated package. Hence, it is
recommended that authentication is done through SSH keys when possible.


### Authentication with Custom Headers

If you want to set arbitrary headers when downloading files using `renv`, you
can do so using the `renv.download.headers` R option. It should be a function
that accepts a URL, and returns a named character vector indicating the headers
which should be supplied when accessing that URL.

For example, suppose you have a package repository hosted at
`https://my/repository`, and the credentials required to access that repository
are stored in the `AUTH_HEADER` environment variable. You could define
`renv.download.headers` like so:

```r
options(renv.download.headers = function(url) {
  if (grepl("^https://my/repository", url))
    return(c(Authorization = Sys.getenv("AUTH_HEADER")))
})
```

With the above, `renv` will set the `Authorization` header whenever it attempts
to download files from the repository at URL `https://my/repository`.


## Shims

To help you take advantage of the package cache, `renv` places a few
shims on the search path:

| **Function**         | **Shim**          |
| -------------------- | ----------------- |
| `install.packages()` | `renv::install()` |
| `remove.packages()`  | `renv::remove()`  |
| `update.packages()`  | `renv::update()`  |

This can be useful when installing packages which have already been cached.
For example, if you use `renv::install("dplyr")`, and `renv` detects that
the latest version on CRAN has already been cached, then `renv` will just
install using the copy available in the cache -- thereby skipping some of
the installation overhead.

If you'd prefer not to use the `renv` shims, they can be disabled by setting
`options(renv.config.shims.enabled = FALSE)`. See `?config`
for more details.


## History

If you're using a version control system with your project, then as you call
`renv::snapshot()` and later commit new lockfiles to your repository, you may
find it necessary later to recover older versions of your lockfiles. `renv`
provides the functions `renv::history()` to list previous revisions of your
lockfile, and `renv::revert()` to recover these older lockfiles.

Currently, only Git repositories are supported by `renv::history()` and
`renv::revert()`.
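
A sketch of how these two functions might be used together, assuming the
project is a Git repository in which `renv.lock` has been committed at least
once. `"<commit>"` is a placeholder for a commit identifier taken from the
output of `renv::history()`.

```r
# List the commits in which the lockfile was changed.
lockfile_history <- renv::history()
head(lockfile_history)

# Recover the lockfile as it was at the chosen commit ...
renv::revert(commit = "<commit>")

# ... and then bring the project library in line with it.
renv::restore()
```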


## Comparison with Packrat

`renv` differs from Packrat in the following ways:

1. The `renv` lockfile `renv.lock` is formatted as [JSON](https://www.json.org/).
   This should make the lockfile easier to use and consume with other tools.

2. `renv` no longer attempts to explicitly download and track R package source
   tarballs within your project. This was a frustrating default that operated
   under the assumption that you might later want to be able to restore a
   project's private library without access to a CRAN repository. In practice,
   this is almost never the case, and the time spent downloading + storing the
   package sources seemed to outweigh the potential reproducibility benefits.

3. Packrat tried to single out so-called 'stale' packages; that is, R packages
   which were installed by Packrat but were not recorded in the lockfile for
   some reason. This distinction was (1) overall not useful, and (2) confusing.
   `renv` no longer makes this distinction:
   `snapshot()` saves the state of your project library to `renv.lock`,
   `restore()` loads the state of your project library from `renv.lock`, and
   that's all.

4. In `renv`, the global package cache is enabled by default. This should
   reduce overall disk-space usage as packages can effectively be shared
   across each project using `renv`.

5. `renv`'s dependency discovery machinery is more configurable. The function
   `renv::dependencies()` is exported, and users can create `.renvignore` files
   to instruct `renv` to ignore specific files and folders in their projects.
   (See `?renv::dependencies` for more information.)


## Migrating from Packrat

The `renv::migrate()` function makes it possible to migrate projects from
Packrat to `renv`. See the `?migrate` documentation for more details. In
essence, calling `renv::migrate("<project path>")` will be enough to
migrate the Packrat library and lockfile such that they can then be
used by `renv`.


## Uninstalling renv

If you find `renv` isn't the right fit for your project, deactivating and
uninstalling it is easy.

- To deactivate `renv` in a project, use `renv::deactivate()`. This removes
  the `renv` auto-loader from the project `.Rprofile`, but doesn't touch any
  other `renv` files used in the project. If you'd like to later re-activate
  `renv`, you can do so with `renv::activate()`.
  
- To remove `renv` from a project, use `renv::deactivate()` to first remove
  the `renv` auto-loader from the project `.Rprofile`, then delete the project's
  `renv` folder and `renv.lock` lockfile as desired.

If you want to completely remove any installed `renv` infrastructure components
from your entire system, you can do so with the following R code:

```
root <- renv::paths$root()
unlink(root, recursive = TRUE)
```

The `renv` package can then also be uninstalled via:

```
utils::remove.packages("renv")
```

Note that if you've customized any of `renv`'s infrastructure paths as described
in `?renv::paths`, then you'll need to find and remove those customized folders
as well.


## Future Work

`renv`, like Packrat, is designed to work standalone without the need to
depend on any non-base R packages. However, the following (future) integrations
are planned:

- Use [pak](https://github.com/r-lib/pak) for parallel package installation,

- Use [sysreqsdb](https://github.com/r-hub/sysreqsdb) to validate and install
  system dependencies as required before attempting to install the associated
  packages.

These integrations will be optional (so that `renv` can always work standalone)
but we hope that they will further improve the speed and reliability of `renv`.

